Technical report: OpenMaTrEx, a free, open-source hybrid data-driven machine translation system
Abstract
This report describes OpenMaTrEx, a free/open-source hybrid data-driven machine translation system whose core example-based components are built on the marker hypothesis. OpenMaTrEx comprises a marker-driven chunker, a collection of chunk aligners, tools to merge (“hybridise”) marker-based and statistical translation tables, two engines (a simple proof-of-concept monotone “example-based” recombination engine and a statistical decoder based on Moses), and support for automatic evaluation. It also supports “word packing” to improve alignment. OpenMaTrEx is a free/open-source release of the basic components of MaTrEx, the Dublin City University machine translation system. The components and processes implemented in OpenMaTrEx are described in both theoretical and functional detail. Experimental results are also reported, comparing OpenMaTrEx with plain statistical machine translation on representative tasks.
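The marker-driven chunker mentioned in the abstract can be illustrated with a minimal sketch. It is not the OpenMaTrEx implementation: the `MARKERS` set and the `marker_chunk` function are hypothetical names, the marker list is a toy English sample, and the rule used (start a new chunk at a marker word, but only once the current chunk already holds at least one non-marker word) is an assumption about how marker-based chunking is commonly described.

```python
# Toy, hypothetical list of English marker (closed-class) words.
# A real system would rely on much larger, language-specific lists.
MARKERS = {
    "the", "a", "an", "in", "on", "of", "to", "with",
    "and", "or", "he", "she", "it", "this", "that",
}


def marker_chunk(tokens, markers=MARKERS):
    """Split a token list into chunks that each begin at a marker word.

    Assumed rule: a marker word opens a new chunk only if the chunk being
    built already contains at least one non-marker (content) word.
    """
    chunks = []
    current = []
    has_content = False  # does `current` hold a non-marker word yet?
    for tok in tokens:
        is_marker = tok.lower() in markers
        if is_marker and has_content:
            # Close the current chunk and start a new one at this marker.
            chunks.append(current)
            current = []
            has_content = False
        current.append(tok)
        if not is_marker:
            has_content = True
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for chunk in marker_chunk("the cat sat on the mat with a hat".split()):
        print(" ".join(chunk))
```

Under these assumptions, consecutive markers such as "on the" stay in the same chunk, because a new chunk opens only after a content word has been seen.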
Similar resources
OpenMaTrEx: A Free/Open-Source Marker-Driven Example-Based Machine Translation System
We describe OpenMaTrEx, a free/open-source example-based machine translation (EBMT) system based on the marker hypothesis, comprising a marker-driven chunker, a collection of chunk aligners, and two engines: one based on a simple proof-of-concept monotone EBMT recombinator and a Moses-based statistical decoder. OpenMaTrEx is a free/open-source release of the basic components of MaTrEx, the Dubli...
Deeper Machine Translation and Evaluation for German
This paper describes a hybrid Machine Translation (MT) system built for translating from English to German in the domain of technical documentation. The system is based on three different MT engines (phrase-based SMT, RBMT, neural) that are joined by a selection mechanism that uses deep linguistic features within a machine learning process. It also presents a detailed source-driven manual error...
OpenMT: Open Source Machine Translation Using Hybrid Methods
The main goal of the OpenMT project is the development of open source machine translation architectures based on hybrid models and advanced syntactic–semantic processors. These architectures combine the three main Machine Translation (MT) frameworks, Rule-based (RBMT), Statistical (SMT) and Example–based (EBMT), into hybrid systems. Defined architectures and results will be open source, allow f...
A Hybrid Machine Translation System Based on a Monotone Decoder
In this paper, a hybrid Machine Translation (MT) system is proposed by combining the result of a rule-based machine translation (RBMT) system with a statistical approach. The RBMT uses a set of linguistic rules for translation, which leads to better translation results in terms of word ordering and syntactic structure. On the other hand, SMT works better in lexical choice. Therefore, in our sys...
Multi-Engine Machine Translation with an Open-Source Decoder for Statistical Machine Translation
We describe an architecture that allows statistical machine translation (SMT) to be combined with rule-based machine translation (RBMT) in a multi-engine setup. We use a variant of standard SMT technology to align translations from one or more RBMT systems with the source text. We incorporate phrases extracted from these alignments into the phrase table of the SMT system and use the open-source dec...
Publication date: 2011